A procedure for unsupervised lexicon learning

Author

  • Anand Venkataraman
Abstract

We describe an incremental unsupervised procedure to learn words from transcribed continuous speech. The algorithm is based on a conservative and traditional statistical model, and results of empirical tests show that it is competitive with other algorithms that have been proposed recently for this task.
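
As a rough illustration of what incremental unsupervised word learning from unsegmented transcripts can look like, here is a minimal sketch. It is not the model from the paper, whose details the abstract does not give. It assumes a unigram lexicon of word counts, a flat per-character cost for unseen words, and a Viterbi-style search for the cheapest split of each utterance. The names `segment`, `learn`, and `novel_cost_per_char` are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def segment(utterance, lexicon, total, novel_cost_per_char=3.0):
    """Return the lowest-cost split of `utterance` into words.

    Known words cost -log(relative frequency); unseen words pay an
    assumed flat cost per character plus one end-of-word charge.
    """
    n = len(utterance)
    best = [0.0] + [math.inf] * n  # best[i]: lowest cost of utterance[:i]
    back = [0] * (n + 1)           # back[i]: start index of the last word
    for end in range(1, n + 1):
        for start in range(end):
            word = utterance[start:end]
            count = lexicon.get(word, 0)
            if count > 0:
                cost = -math.log(count / (total + 1))
            else:
                cost = novel_cost_per_char * (len(word) + 1)
            if best[start] + cost < best[end]:
                best[end] = best[start] + cost
                back[end] = start
    # Follow the back-pointers from the end to recover the words.
    words = []
    i = n
    while i > 0:
        words.append(utterance[back[i]:i])
        i = back[i]
    return words[::-1]


def learn(utterances):
    """Process utterances one at a time, updating the lexicon after each."""
    lexicon = Counter()
    total = 0
    segmented = []
    for utterance in utterances:
        words = segment(utterance, lexicon, total)
        lexicon.update(words)  # words found so far guide later segmentations
        total += len(words)
        segmented.append(words)
    return segmented, lexicon


if __name__ == "__main__":
    segmented, lexicon = learn(["dog", "thedog"])
    print(segmented)
```

With the toy input above, the first utterance is stored whole as a new word, and the second is then split into "the" and "dog" because reusing the known word is cheaper than positing one long new word.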


Similar resources

Induction of Root and Pattern Lexicon for Unsupervised Morphological Analysis of Arabic

We propose an unsupervised approach to learning non-concatenative morphology, which we apply to induce a lexicon of Arabic roots and pattern templates. The approach is based on the idea that roots and patterns may be revealed through mutually recursive scoring based on hypothesized pattern and root frequencies. After a further iterative refinement stage, morphological analysis with the induced ...


Richness of the Base and Probabilistic Unsupervised Learning in Optimality Theory

This paper proposes an unsupervised learning algorithm for Optimality Theoretic grammars, which learns a complete constraint ranking and a lexicon given only unstructured surface forms and morphological relations. The learning algorithm, which is based on the Expectation-Maximization algorithm, gradually maximizes the likelihood of the observed forms by adjusting the parameters of a probabilisti...


Unsupervised learning of derivational morphology

We present in this paper an unsupervised method to learn suffixes and suffixation operations from an inflectional lexicon of a language. The elements acquired with our method are used to build stemming procedures and can assist lexicographers in the development of new lexical resources.


A Simple Unsupervised Learner for POS Disambiguation Rules Given Only a Minimal Lexicon

We propose a new model for unsupervised POS tagging based on linguistic distinctions between open and closed-class items. Exploiting notions from current linguistic theory, the system uses far less information than previous systems, far simpler computational methods, and far sparser descriptions in learning contexts. By applying simple language acquisition techniques based on counting, the syst...




Publication date: 2001